Objective

Predict consumption for the next 3 months from a series of past consumption data together with exogenous variables. The variables next_consume, next_2_consume and next_3_consume are the dependent variables we want to predict.

Load Libraries

# Run the install step once, in a fresh session: calling install.packages() on a
# package that is already loaded triggers "Updating loaded packages" in RStudio.
# install.packages(c("readr", "ranger", "dplyr", "skimr", "caret"))
library(readr)   # read the dataset
library(ranger)  # random forest on steroids
library(dplyr)   # data manipulation
library(skimr)   # inspect the data
library(caret)   # machine learning framework
library(ggplot2) # plots of the prediction and confidence intervals

Load the dataset

dataset <- readr::read_csv("dataset.csv")
# dataset %>% select(Date)
# dataset %>% names() %>% as.data.frame()
skimr::skim(dataset) # %>% knitr::kable() %>% kable_styling(font_size = 9)
── Data Summary ────────────────────────
                           Values 
Name                       dataset
Number of rows             532    
Number of columns          648    
_______________________           
Column type frequency:            
  character                1      
  numeric                  647    
________________________          
Group variables            None   

Methodology


dataset <- dataset %>% tidyr::drop_na()
  # looking at the same data, they predict similarly for the next 3 months.
train <- dataset %>% sample_frac(0.8)
test  <- setdiff(dataset, train)
train
test
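
One caveat with the split above: sample_frac() draws a different sample on every run unless a seed is set, and dplyr's setdiff() on data frames also removes duplicated rows. A minimal sketch of an index-based alternative in base R follows. The data frame toy_data, its columns and the seed value are assumptions made up for illustration, not part of the original analysis.

```r
# Toy data standing in for `dataset` (hypothetical values).
toy_data <- data.frame(id = 1:10, tmg = seq(0.1, 1.0, by = 0.1))

set.seed(42)  # arbitrary seed, only for reproducibility
n <- nrow(toy_data)
train_idx <- sample(n, size = round(0.8 * n))  # 80% of the row indices

toy_train <- toy_data[train_idx, ]
toy_test  <- toy_data[-train_idx, ]  # every row not in the training set

# The two parts are disjoint and together cover all rows.
nrow(toy_train)
nrow(toy_test)
```

Splitting by index keeps duplicated rows in the test set and makes the partition repeatable across runs.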

Model to predict next_consume

rf_model1 <- ranger(tmg ~ . ,data=train)
rf_model1
Ranger result

Call:
 ranger(tmg ~ ., data = train) 

Type:                             Regression 
Number of trees:                  500 
Sample size:                      426 
Number of independent variables:  647 
Mtry:                             25 
Target node size:                 5 
Variable importance mode:         none 
Splitrule:                        variance 
OOB prediction error (MSE):       0.001175333 
R squared (OOB):                  0.8174207 
rf_model1$prediction.error
[1] 0.001175333

Training

Out-of-Bag (OOB) Sampling

The MSE and R squared errors are computed on the OOB samples. The concept of OOB is tied to bootstrapping, the sampling technique used to build the decision trees in a Random Forest. In bootstrapping, a random sample of the training data is drawn with replacement, which means that some instances may be chosen several times while others may not be chosen at all. The instances not chosen for a given tree form that tree's out-of-bag sample, and the OOB error is obtained by predicting each instance using only the trees that did not see it during training.
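
To get a feel for how many observations end up out-of-bag, the bootstrap step can be simulated directly. This is a minimal base R sketch, not ranger's internal code. It assumes each tree draws n rows with replacement, and reuses the sample size (426) and number of trees (500) reported in the ranger output above. The seed is arbitrary.

```r
set.seed(1)      # arbitrary seed for reproducibility
n <- 426         # training-set size reported by ranger above
n_trees <- 500   # number of trees reported by ranger above

# For each tree, draw a bootstrap sample of row indices and record
# the fraction of rows that were never drawn (the out-of-bag rows).
oob_fraction <- replicate(n_trees, {
  in_bag <- sample(n, size = n, replace = TRUE)
  mean(!(seq_len(n) %in% in_bag))
})

mean(oob_fraction)  # close to (1 - 1/n)^n, roughly 0.37
(1 - 1 / n)^n
```

Roughly a third of the training rows are therefore unseen by any given tree, which is what allows the OOB error to act as a built-in validation estimate.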

Variable importance

rf_model1 <- ranger(tmg ~ . ,data=train, importance = "impurity")
rf_model1
Ranger result

Call:
 ranger(tmg ~ ., data = train, importance = "impurity") 

Type:                             Regression 
Number of trees:                  500 
Sample size:                      426 
Number of independent variables:  647 
Mtry:                             25 
Target node size:                 5 
Variable importance mode:         impurity 
Splitrule:                        variance 
OOB prediction error (MSE):       0.001174988 
R squared (OOB):                  0.8174742 

impurity: This mode computes the importance of a feature from the decrease in node impurity obtained when that feature is used to split in the decision trees. For a regression forest, as here, the impurity is the variance of the response (the Gini index is used for classification). It is not ranger's default: the first model's output reports "Variable importance mode: none", so it has to be requested with importance = "impurity". The larger the decrease in impurity, the more important the feature is considered.
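
The idea of an impurity decrease can be illustrated for a single split. This is a conceptual base R sketch with made-up numbers; the helper names sse and impurity_decrease are hypothetical, and ranger's exact aggregation and scaling across trees may differ.

```r
# Toy response and one candidate feature (hypothetical values).
y <- c(1.0, 1.2, 0.9, 3.1, 2.9, 3.0)
x <- c(0.1, 0.2, 0.3, 0.7, 0.8, 0.9)

# Sum of squared deviations from the mean: the node "impurity"
# under a variance-based split rule.
sse <- function(v) sum((v - mean(v))^2)

# Impurity decrease obtained by splitting the node at `threshold`.
impurity_decrease <- function(x, y, threshold) {
  left  <- y[x <= threshold]
  right <- y[x >  threshold]
  sse(y) - (sse(left) + sse(right))
}

impurity_decrease(x, y, threshold = 0.5)  # large: the split separates the two groups
impurity_decrease(x, y, threshold = 0.1)  # smaller: only one point is split off
```

A feature that is never chosen for a split accumulates no decrease at all, which is consistent with the many exact zeros in the importance listing.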


rf_model1$variable.importance
         name         psd_1         psd_2         psd_3         psd_4         psd_5         psd_6         psd_7         psd_8         psd_9        psd_10        psd_11 
 0.0261748227  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
       psd_12        psd_13        psd_14        psd_15        psd_16        psd_17        psd_18        psd_19        psd_20        psd_21        psd_22        psd_23 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
       psd_24        psd_25        psd_26        psd_27        psd_28        psd_29        psd_30        psd_31        psd_32        psd_33        psd_34        psd_35 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
       psd_36        psd_37        psd_38        psd_39        psd_40        psd_41        psd_42        psd_43        psd_44        psd_45        psd_46        psd_47 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
       psd_48        psd_49        psd_50        psd_51        psd_52        psd_53        psd_54        psd_55        psd_56        psd_57        psd_58        psd_59 
 0.0000000000  0.0000000000  0.0213483469  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0094678994  0.0000000000 
       psd_60       psd11_1       psd12_1       psd22_1       psd11_2       psd12_2       psd22_2       psd11_3       psd12_3       psd22_3       psd11_4       psd12_4 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
      psd22_4       psd11_5       psd12_5       psd22_5       psd11_6       psd12_6       psd22_6       psd11_7       psd12_7       psd22_7       psd11_8       psd12_8 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
      psd22_8       psd11_9       psd12_9       psd22_9      psd11_10      psd12_10      psd22_10      psd11_11      psd12_11      psd22_11      psd11_12      psd12_12 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_12      psd11_13      psd12_13      psd22_13      psd11_14      psd12_14      psd22_14      psd11_15      psd12_15      psd22_15      psd11_16      psd12_16 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_16      psd11_17      psd12_17      psd22_17      psd11_18      psd12_18      psd22_18      psd11_19      psd12_19      psd22_19      psd11_20      psd12_20 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_20      psd11_21      psd12_21      psd22_21      psd11_22      psd12_22      psd22_22      psd11_23      psd12_23      psd22_23      psd11_24      psd12_24 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_24      psd11_25      psd12_25      psd22_25      psd11_26      psd12_26      psd22_26      psd11_27      psd12_27      psd22_27      psd11_28      psd12_28 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_28      psd11_29      psd12_29      psd22_29      psd11_30      psd12_30      psd22_30      psd11_31      psd12_31      psd22_31      psd11_32      psd12_32 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_32      psd11_33      psd12_33      psd22_33      psd11_34      psd12_34      psd22_34      psd11_35      psd12_35      psd22_35      psd11_36      psd12_36 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_36      psd11_37      psd12_37      psd22_37      psd11_38      psd12_38      psd22_38      psd11_39      psd12_39      psd22_39      psd11_40      psd12_40 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_40      psd11_41      psd12_41      psd22_41      psd11_42      psd12_42      psd22_42      psd11_43      psd12_43      psd22_43      psd11_44      psd12_44 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_44      psd11_45      psd12_45      psd22_45      psd11_46      psd12_46      psd22_46      psd11_47      psd12_47      psd22_47      psd11_48      psd12_48 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_48      psd11_49      psd12_49      psd22_49      psd11_50      psd12_50      psd22_50      psd11_51      psd12_51      psd22_51      psd11_52      psd12_52 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0690166942  0.0721274150  0.0576661055  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_52      psd11_53      psd12_53      psd22_53      psd11_54      psd12_54      psd22_54      psd11_55      psd12_55      psd22_55      psd11_56      psd12_56 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_56      psd11_57      psd12_57      psd22_57      psd11_58      psd12_58      psd22_58      psd11_59      psd12_59      psd22_59      psd11_60      psd12_60 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0707740774  0.0774427979  0.0537484588  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
     psd22_60      coordc_1      coordc_2      coordc_3      coordc_4      coordc_5      coordc_6      coordc_7      coordc_8      coordc_9     coordc_10     coordc_11 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0025208550 
    coordc_12     coordc_13     coordc_14     coordc_15     coordc_16     coordc_17     coordc_18     coordc_19     coordc_20     coordc_21     coordc_22     coordc_23 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0071077313  0.0000000000  0.0000000000 
    coordc_24     coordc_25     coordc_26     coordc_27     coordc_28     coordc_29     coordc_30     coordc_31     coordc_32     coordc_33     coordc_34     coordc_35 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
    coordc_36     coordc_37     coordc_38     coordc_39     coordc_40     coordc_41     coordc_42     coordc_43     coordc_44     coordc_45     coordc_46     coordc_47 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0088438379  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
    coordc_48     coordc_49     coordc_50     coordc_51     coordc_52     coordc_53     coordc_54     coordc_55     coordc_56     coordc_57     coordc_58     coordc_59 
 0.0000000000  0.0000000000  0.0000000000  0.0144601102  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
    coordc_60     coordc_61     coordc_62     coordc_63     coordc_64     coordc_65     coordc_66     coordc_67     coordc_68     coordc_69     coordc_70     coordc_71 
 0.0000000000  0.0168602614  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0066789920 
    coordc_72     coordc_73     coordc_74     coordc_75     coordc_76     coordc_77     coordc_78     coordc_79     coordc_80     coordc_81     coordc_82     coordc_83 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0211646435  0.0000000000  0.0000000000 
    coordc_84     coordc_85     coordc_86     coordc_87     coordc_88     coordc_89     coordc_90     coordc_91     coordc_92     coordc_93     coordc_94     coordc_95 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
    coordc_96     coordc_97     coordc_98     coordc_99    coordc_100   coordc_fe_1   coordc_fe_2   coordc_fe_3   coordc_fe_4   coordc_fe_5   coordc_fe_6   coordc_fe_7 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
  coordc_fe_8   coordc_fe_9  coordc_fe_10  coordc_fe_11  coordc_fe_12  coordc_fe_13  coordc_fe_14  coordc_fe_15  coordc_fe_16  coordc_fe_17  coordc_fe_18  coordc_fe_19 
 0.0000000000  0.0000000000  0.0000000000  0.0044248989  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_fe_20  coordc_fe_21  coordc_fe_22  coordc_fe_23  coordc_fe_24  coordc_fe_25  coordc_fe_26  coordc_fe_27  coordc_fe_28  coordc_fe_29  coordc_fe_30  coordc_fe_31 
 0.0000000000  0.0151897063  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_fe_32  coordc_fe_33  coordc_fe_34  coordc_fe_35  coordc_fe_36  coordc_fe_37  coordc_fe_38  coordc_fe_39  coordc_fe_40  coordc_fe_41  coordc_fe_42  coordc_fe_43 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0184656877  0.0000000000  0.0000000000 
 coordc_fe_44  coordc_fe_45  coordc_fe_46  coordc_fe_47  coordc_fe_48  coordc_fe_49  coordc_fe_50  coordc_fe_51  coordc_fe_52  coordc_fe_53  coordc_fe_54  coordc_fe_55 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0145968020  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_fe_56  coordc_fe_57  coordc_fe_58  coordc_fe_59  coordc_fe_60  coordc_fe_61  coordc_fe_62  coordc_fe_63  coordc_fe_64  coordc_fe_65  coordc_fe_66  coordc_fe_67 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0118686704  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_fe_68  coordc_fe_69  coordc_fe_70  coordc_fe_71  coordc_fe_72  coordc_fe_73  coordc_fe_74  coordc_fe_75  coordc_fe_76  coordc_fe_77  coordc_fe_78  coordc_fe_79 
 0.0000000000  0.0000000000  0.0000000000  0.0174568488  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_fe_80  coordc_fe_81  coordc_fe_82  coordc_fe_83  coordc_fe_84  coordc_fe_85  coordc_fe_86  coordc_fe_87  coordc_fe_88  coordc_fe_89  coordc_fe_90  coordc_fe_91 
 0.0000000000  0.0434497086  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_fe_92  coordc_fe_93  coordc_fe_94  coordc_fe_95  coordc_fe_96  coordc_fe_97  coordc_fe_98  coordc_fe_99 coordc_fe_100   coordc_ni_1   coordc_ni_2   coordc_ni_3 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
  coordc_ni_4   coordc_ni_5   coordc_ni_6   coordc_ni_7   coordc_ni_8   coordc_ni_9  coordc_ni_10  coordc_ni_11  coordc_ni_12  coordc_ni_13  coordc_ni_14  coordc_ni_15 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0032470147  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_ni_16  coordc_ni_17  coordc_ni_18  coordc_ni_19  coordc_ni_20  coordc_ni_21  coordc_ni_22  coordc_ni_23  coordc_ni_24  coordc_ni_25  coordc_ni_26  coordc_ni_27 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0128599356  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_ni_28  coordc_ni_29  coordc_ni_30  coordc_ni_31  coordc_ni_32  coordc_ni_33  coordc_ni_34  coordc_ni_35  coordc_ni_36  coordc_ni_37  coordc_ni_38  coordc_ni_39 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_ni_40  coordc_ni_41  coordc_ni_42  coordc_ni_43  coordc_ni_44  coordc_ni_45  coordc_ni_46  coordc_ni_47  coordc_ni_48  coordc_ni_49  coordc_ni_50  coordc_ni_51 
 0.0000000000  0.0169897201  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0098439979 
 coordc_ni_52  coordc_ni_53  coordc_ni_54  coordc_ni_55  coordc_ni_56  coordc_ni_57  coordc_ni_58  coordc_ni_59  coordc_ni_60  coordc_ni_61  coordc_ni_62  coordc_ni_63 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0169251622  0.0000000000  0.0000000000 
 coordc_ni_64  coordc_ni_65  coordc_ni_66  coordc_ni_67  coordc_ni_68  coordc_ni_69  coordc_ni_70  coordc_ni_71  coordc_ni_72  coordc_ni_73  coordc_ni_74  coordc_ni_75 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0120012339  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_ni_76  coordc_ni_77  coordc_ni_78  coordc_ni_79  coordc_ni_80  coordc_ni_81  coordc_ni_82  coordc_ni_83  coordc_ni_84  coordc_ni_85  coordc_ni_86  coordc_ni_87 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0286253719  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
 coordc_ni_88  coordc_ni_89  coordc_ni_90  coordc_ni_91  coordc_ni_92  coordc_ni_93  coordc_ni_94  coordc_ni_95  coordc_ni_96  coordc_ni_97  coordc_ni_98  coordc_ni_99 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
coordc_ni_100         pec_1         pec_2         pec_3         pec_4         pec_5         pec_6         pec_7         pec_8         pec_9        pec_10        pec_11 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000 
       pec_12        pec_13        pec_14        pec_15        pec_16        pec_17        pec_18        pec_19        pec_20        pec_21        pec_22        pec_23 
 0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0000000000  0.0031291087  0.0227018934  0.0321348274  0.0285505033  0.0265709627 
       pec_24        pec_25        pec_26        pec_27        pec_28        pec_29        pec_30        pec_31        pec_32        pec_33        pec_34        pec_35 
 0.0068347184  0.0583014758  0.0128612956  0.0995415681  0.1849249501  0.0640673810  0.0188584171  0.0626721214  0.0837703340  0.0141363145  0.0104518916  0.0114475262 
       pec_36        pec_37        pec_38        pec_39        pec_40        pec_41        pec_42        pec_43        pec_44        pec_45        pec_46        pec_47 
 0.0161978632  0.0202804911  0.0173945538  0.0267245812  0.0343740395  0.0156229130  0.0174097906  0.0067164145  0.0061465131  0.0069234907  0.0123972543  0.0313430392 
       pec_48        pec_49        pec_50        pec_51        pec_52        pec_53        pec_54        pec_55        pec_56        pec_57        pec_58        pec_59 
 0.0173603900  0.0260202814  0.0556116195  0.0699605699  0.0576901226  0.0154313787  0.0125424197  0.0141669358  0.0232782853  0.0185153903  0.0104149366  0.0119663777 
       pec_60        pec_61        pec_62        pec_63        pec_64        pec_65        pec_66        pec_67        pec_68        pec_69        pec_70        pec_71 
 0.0180693435  0.0418654130  0.0801950440  0.0124100320  0.0101589735  0.0081179298  0.0054377203  0.0054112043  0.0059716033  0.0064917542  0.0029708978  0.0161584140 
       pec_72        pec_73        pec_74        pec_75        pec_76        pec_77        pec_78        pec_79        pec_80        pec_81        pec_82        pec_83 
 0.0169298940  0.0041864823  0.0096476878  0.0047093208  0.0059649995  0.0046060194  0.0099199141  0.0029295623  0.0025814292  0.0028573451  0.0029830054  0.0056214248 
       pec_84        pec_85        pec_86        pec_87        pec_88        pec_89        pec_90        pec_91        pec_92        pec_93        pec_94        pec_95 
 0.0048704332  0.0005914519  0.0010063448  0.0067050386  0.0058793785  0.0019469364  0.0007929125  0.0059452211  0.0023318256  0.0017673131  0.0002195631  0.0003543456 
       pec_96        pec_97        pec_98        pec_99       pec_100          fe_s          ni_s          fe_c          ni_c          n_fe          n_ni 
 0.0002307600  0.0005581859  0.0016181635  0.0017680639  0.0003468896  0.0493928509  0.0306639906  0.0401905849  0.0338549970  0.0366818048  0.0311937743 
data.frame(impurity=rf_model1$variable.importance) %>% arrange(desc(impurity))
rf_model1 <- ranger(tmg ~ pec_28 + pec_27 + pec_32 + pec_62 + psd12_58 + psd12_50 + psd11_58 + pec_51 + psd11_50 + pec_29,data=train, importance = "impurity")

rf_model1
Ranger result

Call:
 ranger(tmg ~ pec_28 + pec_27 + pec_32 + pec_62 + psd12_58 + psd12_50 +      psd11_58 + pec_51 + psd11_50 + pec_29, data = train, importance = "impurity") 

Type:                             Regression 
Number of trees:                  500 
Sample size:                      426 
Number of independent variables:  10 
Mtry:                             3 
Target node size:                 5 
Variable importance mode:         impurity 
Splitrule:                        variance 
OOB prediction error (MSE):       0.001202208 
R squared (OOB):                  0.8132459 
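
The ten predictors in the reduced formula are the ten largest impurity values in the listing. Instead of typing them by hand, they could be selected programmatically. This is a minimal base R sketch that uses a few of the importance values printed above as a stand-in for rf_model1$variable.importance. The vector imp and the helper top_k_formula are illustrative names, not part of ranger.

```r
# A few importance values copied from the output above
# (stand-in for rf_model1$variable.importance).
imp <- c(pec_27 = 0.0995415681, pec_28 = 0.1849249501,
         pec_29 = 0.0640673810, pec_32 = 0.0837703340,
         pec_62 = 0.0801950440, psd_1  = 0.0000000000)

# Build a formula `response ~ top-k predictors` from a named importance vector.
top_k_formula <- function(importance, response, k) {
  top <- names(sort(importance, decreasing = TRUE))[seq_len(k)]
  reformulate(top, response = response)
}

f <- top_k_formula(imp, response = "tmg", k = 3)
f  # tmg ~ pec_28 + pec_27 + pec_32
```

With the fitted model, the same helper would be called on rf_model1$variable.importance with k = 10 and the resulting formula passed to ranger().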

Prediction

predictions1 <- predict(rf_model1,data = test, type='response')
predictions1$predictions %>% as.data.frame

MSE on the test set

mse<-function(act,pred) {mean((act- pred)^2)}

data.frame(pred=predictions1$predictions, act=test$tmg) %>% summarise(mse=mse(act,pred))
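
Because MSE is expressed in squared units of tmg, its square root (RMSE) is often easier to read. A small sketch with made-up actual and predicted values follows; the rmse helper and the toy vectors are assumptions added for illustration.

```r
mse  <- function(act, pred) mean((act - pred)^2)
rmse <- function(act, pred) sqrt(mse(act, pred))

# Toy actual and predicted values (hypothetical numbers).
act  <- c(0.50, 0.62, 0.71, 0.55)
pred <- c(0.48, 0.65, 0.70, 0.60)

mse(act, pred)
rmse(act, pred)

# RMSE corresponding to the OOB MSE reported by ranger for the full model.
sqrt(0.001175333)  # about 0.034
```

Comparing the test-set value with the OOB value gives a quick check on whether the OOB estimate was optimistic.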

Prediction intervals (quantile regression)

Instead of reporting only the mean, quantiles are used to obtain a prediction interval (Meinshausen, 2006). The trees are grown as in a standard regression forest (the output below still reports "Splitrule: variance"). The difference is in the leaves: observed response values are retained rather than only their mean, and at prediction time the quantiles are estimated from the values in the leaves the new observation falls into.

rf_model1 <- ranger(tmg ~ . ,data=train, importance = "impurity",quantreg = TRUE)
rf_model1
Ranger result

Call:
 ranger(tmg ~ ., data = train, importance = "impurity", quantreg = TRUE) 

Type:                             Regression 
Number of trees:                  500 
Sample size:                      426 
Number of independent variables:  647 
Mtry:                             25 
Target node size:                 5 
Variable importance mode:         impurity 
Splitrule:                        variance 
OOB prediction error (MSE):       0.001160973 
R squared (OOB):                  0.8196514 
predictions1 <- predict(rf_model1,data = test, type= "quantiles")
predictions1$predictions %>% as.data.frame()
p1<-data.frame(predictions1$predictions,act=test$tmg,label="Mag prediction")

p1  %>% ggplot()+
  geom_point(aes(x=act,y=quantile..0.5),color='red')+
  geom_errorbar(aes(x=act,y=quantile..0.5,ymax=quantile..0.9,ymin=quantile..0.1),color='orange')+
  theme_classic()
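
The interval between the 0.1 and 0.9 quantiles is nominally an 80% prediction interval, so one sanity check is to count how many actual values fall inside it. A minimal sketch with made-up numbers follows; the coverage helper is a hypothetical name, and with the objects above one would pass p1$act, p1$quantile..0.1 and p1$quantile..0.9.

```r
# Fraction of actual values that fall inside [lower, upper].
coverage <- function(act, lower, upper) {
  mean(act >= lower & act <= upper)
}

# Toy values (hypothetical numbers).
act   <- c(0.50, 0.62, 0.71, 0.55, 0.80)
lower <- c(0.45, 0.60, 0.60, 0.56, 0.70)
upper <- c(0.55, 0.70, 0.70, 0.65, 0.90)

coverage(act, lower, upper)  # 3 of the 5 values are inside their interval
```

An empirical coverage far below the nominal level would suggest that the intervals are too narrow for this data.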

Confidence interval

Similar to the prediction interval, and the two are often confused in Random Forest implementations. In general terms, a prediction interval applies to an individual observation or prediction, whereas a confidence interval concerns a statistic. One analogy is the difference between the standard deviation of a variable and the standard error computed over a set of samples.
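
The standard deviation versus standard error analogy can be made concrete with simulated data. In this sketch the variable x, its distribution and the seed are assumptions chosen only for illustration.

```r
set.seed(7)                          # arbitrary seed for reproducibility
x <- rnorm(400, mean = 10, sd = 2)   # simulated variable (hypothetical)

sd_x <- sd(x)                    # spread of the individual observations
se_x <- sd_x / sqrt(length(x))   # uncertainty of the sample mean

sd_x  # stays near 2 however many observations are collected
se_x  # shrinks as 1 / sqrt(n); here it equals sd_x / 20
```

In the same way, a prediction interval stays wide because individual observations are noisy, while a confidence interval narrows as the estimate itself becomes more stable.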

If we change the training dataset just a little bit, will Random Forest give you the same result for that particular example?

(Wager et al. 2014) Based on a technique called the jackknife.

rf_model1 <- ranger(tmg ~ ., data=train, importance = "impurity",keep.inbag  = TRUE)
rf_model1
Ranger result

Call:
 ranger(tmg ~ ., data = train, importance = "impurity", keep.inbag = TRUE) 

Type:                             Regression 
Number of trees:                  500 
Sample size:                      426 
Number of independent variables:  647 
Mtry:                             25 
Target node size:                 5 
Variable importance mode:         impurity 
Splitrule:                        variance 
OOB prediction error (MSE):       0.001156417 
R squared (OOB):                  0.8203593 
predictions1 <- predict(rf_model1,data = train, type= "se")

predictions1$predictions %>% as.data.frame()
predictions1$se %>% as.data.frame()

data.frame(pred=predictions1$predictions, se= predictions1$se, act=train$tmg) %>% ggplot()+
  geom_point(aes(x=act,y=pred),color='red')+
  geom_errorbar(aes(x=act,y=pred,ymax=pred+se,ymin=pred-se),color='orange')+
  theme_classic()
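
The error bars above span one standard error on each side of the prediction. If an approximate 95% confidence interval is wanted instead, one common approach is to scale the standard error by a normal quantile. The sketch below assumes the prediction is approximately normally distributed, and the numbers are made up for illustration.

```r
# Toy predictions and standard errors (hypothetical numbers).
pred <- c(0.50, 0.62, 0.71)
se   <- c(0.02, 0.03, 0.01)

z <- qnorm(0.975)  # about 1.96 for a two-sided 95% interval
ci <- data.frame(pred  = pred,
                 lower = pred - z * se,
                 upper = pred + z * se)
ci
```

With the objects above, pred and se would be predictions1$predictions and predictions1$se.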
